About the company
ChainSafe is a leading blockchain research and development firm specializing in infrastructure solutions for the decentralized web. Alongside its contributions to significant ecosystems such as Ethereum, Polkadot, Filecoin, Mina, and more, ChainSafe creates solutions for developers and teams across the web3 space, drawing on its expertise in gaming, bridging, NFTs, and decentralized storage. As part of its mission to build innovative products for users and better tooling for developers, ChainSafe embodies an open-source and community-oriented ethos.
Job Summary
Responsibilities
📍Ensure reliable operation of the company’s distributed Relayer nodes across various blockchain networks (EVM, Substrate, Cosmos SDK) while adhering to internal SLAs and committed KPIs
📍Design and implement procedures for Sygma’s Relayer node operations (deployment and upgrade, incident response, and key management)
📍Build monitoring and observability for Sygma services, including a distributed set of relayers and various blockchain full nodes
📍Provide training and guidance for other members of the infrastructure team, ensuring round-the-clock node operation and incident response
📍Document and communicate technical details via open-source documentation
📍Collaborate with internal teams and the wider community to build, expand, and scale Sygma’s architecture by tapping into new trends and opportunities highlighted by internal data, blockchain research, and the wider blockchain ecosystem
Requirements
📍Solid development experience with Golang
📍Experience working with AWS services
📍Demonstrable experience with modern Infrastructure as Code (IaC) tools (Terraform, Helm, Ansible, etc.), deployment automation, and CI/CD best practices and tools
📍Experience with monitoring and alerting tools (DataDog, Grafana, Prometheus, etc.)
📍Experience implementing distributed tracing, monitoring, and logging systems using the OpenTelemetry Protocol
📍Experience building and participating in incident response systems (PagerDuty, etc.) and handling emergency response to production environment failures
📍In-depth knowledge of distributed systems and blockchain technology
📍Excellent communication skills, with the ability to document and convey technical details clearly
📍Ability to work autonomously as well as with the wider team